Scaling SDI Systems via Query Clustering and Aggregation

Authors

  • Xi Zhang
  • Liang Huai Yang
  • Mong-Li Lee
  • Wynne Hsu
Abstract

XML-based Selective Dissemination of Information (SDI) systems aim to quickly deliver useful information to users based on their profiles or subscriptions. These subscriptions are specified as XML queries. This paper investigates how clustering and aggregating user queries can help scale SDI systems by reducing the number of document-subscription matchings required. We design a new distance function to measure the similarity of query patterns, and develop a filtering technique called YFilter* that is based on YFilter. Experimental results show that the proposed approach achieves high precision and high recall while reducing runtime.
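
To make the idea of clustering and aggregating subscriptions concrete, the following is a minimal sketch, not the paper's method. The abstract does not give the actual distance function or the YFilter* algorithm, so everything here is an assumption chosen for illustration: queries are simple absolute child-axis paths, the distance is based on shared path prefixes, clustering is a greedy single pass, and the names `steps`, `distance`, `cluster`, `aggregate` and the threshold value are hypothetical.

```python
from typing import List


def steps(query: str) -> List[str]:
    """Split a simple path query such as '/a/b/c' into its element steps."""
    return [s for s in query.split("/") if s]


def distance(q1: str, q2: str) -> float:
    """Illustrative distance (an assumption, not the paper's function):
    1 - (length of shared prefix / length of the longer path)."""
    a, b = steps(q1), steps(q2)
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return 1.0 - shared / max(len(a), len(b), 1)


def cluster(queries: List[str], threshold: float) -> List[List[str]]:
    """Greedy clustering: a query joins the first cluster whose first
    member is within `threshold`; otherwise it starts a new cluster."""
    clusters: List[List[str]] = []
    for q in queries:
        for c in clusters:
            if distance(q, c[0]) <= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters


def aggregate(members: List[str]) -> str:
    """Aggregate a cluster into its longest common prefix path. Any
    document containing a member's path also contains this prefix, so
    the aggregate acts as a coarser pre-filter for the whole cluster."""
    prefix: List[str] = []
    for parts in zip(*(steps(q) for q in members)):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return "/" + "/".join(prefix)


subscriptions = [
    "/news/sports/soccer",
    "/news/sports/tennis",
    "/stock/nasdaq/quote",
]
groups = cluster(subscriptions, threshold=0.5)
aggregated = [aggregate(g) for g in groups]
```

In this sketch three subscriptions collapse into two aggregated queries, so a document is matched against two patterns instead of three. Because an aggregate is more general than its members, it can admit documents that no individual subscriber asked for, which is why precision and recall are the natural measures for such an approach.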

Similar resources

Applying OLAP Pre-Aggregation Techniques to Speed Up Aggregate Query Processing in Array Databases

Large multidimensional arrays of data are common in a variety of scientific applications. In the past, arrays have typically been stored in files, and then manipulated by customized programs operating on those files. Nowadays, with science moving toward computational databases, the trend is toward a new class of database, the array database. In the broadest sense, the array database supports va...

Cost-based query-adaptive clustering for multidimensional objects with spatial extents. (Groupement d'Objets Multidimensionnels Etendus avec un Modèle de Coût Adaptatif aux Requêtes)

We propose a cost-based query-adaptive clustering solution for multidimensional objects with spatial extents to speed up execution of spatial range queries (e.g., intersection, containment). Our work was motivated by the emergence of many SDI applications (Selective Dissemination of Information) bringing out new real challenges for the multidimensional data indexing. Our clusterin...

An efficient peer-to-peer indexing tree structure for multidimensional data

As one of the most important technologies for implementing large-scale distributed systems, peer-to-peer (P2P) computing has attracted much attention in both research and academia communities, for its advantages such as high availability, high performance, and high flexibility to the dynamics of networks. However, multidimensional data indexing remains a big challenge for P2P computing, because...

Analysis of User query refinement behavior based on semantic features: user log analysis of Ganj database (IranDoc)

Background and Aim: Information systems cannot be well designed or developed without a clear understanding of the needs of users and the manner of their information seeking and evaluating. This research has been designed to analyze the query refinement behaviors of users of Ganj (the Iranian Research Institute of Science and Technology database) via log analysis. Methods: The method of this research is log anal...

An Efficient Data Aggregation and Query Optimization for Energy Efficiency in Wireless Sensor Network

A Wireless Sensor Network (WSN) is a set of dispersed, independent devices used to sense physical or environmental conditions; it comprises a collection of discrete sensors for observing and classifying the collected data. Data aggregation is a method of collecting data from sensor nodes and monitoring the information obtained from the sensors. Data aggregation minimizes the traffic in netw...

Publication date: 2004